free energy difference
Enhanced Diffusion Sampling: Efficient Rare Event Sampling and Free Energy Calculation with Diffusion Models
Xie, Yu, Winkler, Ludwig, Sun, Lixin, Lewis, Sarah, Foster, Adam E., Luna, José Jiménez, Hempel, Tim, Gastegger, Michael, Chen, Yaoyi, Zaporozhets, Iryna, Clementi, Cecilia, Bishop, Christopher M., Noé, Frank
The rare-event sampling problem has long been the central limiting factor in molecular dynamics (MD), especially in biomolecular simulation. Recently, diffusion models such as BioEmu have emerged as powerful equilibrium samplers that generate independent samples from complex molecular distributions, eliminating the cost of sampling rare transition events. However, a sampling problem remains when computing observables that rely on states which are rare in equilibrium, for example folding free energies. Here, we introduce enhanced diffusion sampling, enabling efficient exploration of rare-event regions while preserving unbiased thermodynamic estimators. The key idea is to perform quantitatively accurate steering protocols to generate biased ensembles and subsequently recover equilibrium statistics via exact reweighting. We instantiate our framework in three algorithms: UmbrellaDiff (umbrella sampling with diffusion models), $Δ$G-Diff (free-energy differences via tilted ensembles), and MetaDiff (a batchwise analogue for metadynamics). Across toy systems, protein folding landscapes and folding free energies, our methods achieve fast, accurate, and scalable estimation of equilibrium properties within GPU-minutes to hours per system -- closing the rare-event sampling gap that remained after the advent of diffusion-model equilibrium samplers.
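The bias-then-reweight idea in the abstract above can be illustrated with a toy example. This is a minimal sketch and not the paper's UmbrellaDiff, $\Delta$G-Diff, or MetaDiff algorithms: it assumes a 1D harmonic energy U(x) = x^2/2, a hypothetical linear bias V(x) = -lam * x, and an analytically known Gaussian sampler standing in for a steered diffusion model. All names (`lam`, `threshold`, `bias`) are illustrative.

```python
import math
import numpy as np

beta = 1.0        # inverse temperature (assumed units with k_B T = 1)
lam = 2.0         # hypothetical strength of the linear bias
threshold = 4.0   # rare event of interest: x > threshold
n = 400_000       # number of biased samples

def bias(x):
    """Bias potential V(x) = -lam * x, which pushes samples toward large x."""
    return -lam * x

rng = np.random.default_rng(0)

# For U(x) = x^2 / 2, the biased density exp(-beta * (U + V)) is a Gaussian
# with mean lam and variance 1 / beta, so it can be sampled directly here.
x = rng.normal(loc=lam, scale=1.0 / math.sqrt(beta), size=n)

# Reweighting: each biased sample gets weight exp(+beta * V(x)) to undo the bias.
log_w = beta * bias(x)
w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability

# Self-normalized estimate of the unbiased probability P(x > threshold).
p_reweighted = float(np.sum(w * (x > threshold)) / np.sum(w))

# Exact tail probability of the unbiased Gaussian N(0, 1 / beta), for comparison.
p_exact = 0.5 * math.erfc(threshold * math.sqrt(beta) / math.sqrt(2.0))

print(f"reweighted: {p_reweighted:.3e}, exact: {p_exact:.3e}")
```

With these settings an unbiased sampler would see only about a dozen events above the threshold, whereas the biased ensemble places a few percent of its samples there. A single strongly tilted window would make the weights very uneven, which is one reason practical schemes combine several bias strengths.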
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Energy (0.68)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > New York (0.04)
- Europe > Germany > Lower Saxony > Gottingen (0.04)
Refining Machine Learning Potentials through Thermodynamic Theory of Phase Transitions
Foundational Machine Learning Potentials can resolve the accuracy and transferability limitations of classical force fields. They enable microscopic insights into material behavior through Molecular Dynamics simulations, which can crucially expedite material design and discovery. However, insufficiently broad and systematically biased reference data affect the predictive quality of the learned models. Often, these models exhibit significant deviations from experimentally observed phase transition temperatures, on the order of several hundred kelvins. Thus, fine-tuning is necessary to achieve adequate accuracy in many practical problems. This work proposes a fine-tuning strategy via top-down learning, directly correcting the wrongly predicted transition temperatures to match the experimental reference data. Our approach leverages the Differentiable Trajectory Reweighting algorithm to minimize the free energy differences between phases at the experimental target pressures and temperatures. We demonstrate that our approach can accurately correct the phase diagram of pure Titanium in a pressure range of up to 5 GPa, matching the experimental reference within tenths of kelvins and improving the liquid-state diffusion constant. Our approach is model-agnostic, applicable to multi-component systems with solid-solid and solid-liquid transitions, and compliant with top-down training on other experimental properties. Therefore, our approach can serve as an essential step towards highly accurate application-specific and foundational machine learning potentials.
Scalable Boltzmann Generators for equilibrium sampling of large-scale materials
Schebek, Maximilian, Noé, Frank, Rogal, Jutta
The use of generative models to sample equilibrium distributions of many-body systems, as first demonstrated by Boltzmann Generators, has attracted substantial interest due to their ability to produce unbiased and uncorrelated samples in 'one shot'. Despite their promise and impressive results across the natural sciences, scaling these models to large systems remains a major challenge. In this work, we introduce a Boltzmann Generator architecture that addresses this scalability bottleneck with a focus on applications in materials science. We leverage augmented coupling flows in combination with graph neural networks to base the generation process on local environmental information, while allowing for energy-based training and fast inference. Compared to previous architectures, our model trains significantly faster, requires far fewer computational resources, and achieves superior sampling efficiencies. Crucially, the architecture is transferable to larger system sizes, which allows for the efficient sampling of materials with simulation cells of unprecedented size. We demonstrate the potential of our approach by applying it to several materials systems, including Lennard-Jones crystals, ice phases of mW water, and the phase diagram of silicon, for system sizes well above one thousand atoms. The trained Boltzmann Generators produce highly accurate equilibrium ensembles for various crystal structures, as well as Helmholtz and Gibbs free energies across a range of system sizes, able to reach scales where finite-size effects become negligible.
- Europe > Germany > Berlin (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Texas > Harris County > Houston (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
FEAT: Free energy Estimators with Adaptive Transport
He, Jiajun, Du, Yuanqi, Vargas, Francisco, Wang, Yuanqing, Gomes, Carla P., Hernández-Lobato, José Miguel, Vanden-Eijnden, Eric
FEAT leverages learned transports implemented via stochastic interpolants and provides consistent, minimum-variance estimators based on the escorted Jarzynski equality and the controlled Crooks theorem, alongside variational upper and lower bounds on free energy differences. Unifying equilibrium and non-equilibrium methods under a single theoretical framework, FEAT establishes a principled foundation for neural free energy calculations. Experimental validation on toy examples, molecular simulations, and quantum field theory demonstrates improvements over existing learning-based methods.
1 Introduction
Estimating free energy is fundamental across machine learning (appearing as normalization factors and the model evidence), statistical mechanics (partition functions), chemistry, and biology (Chipot and Pohorille, 2007; Lelièvre et al., 2010; Tuckerman, 2023). The free energy is expressed as:
$$F = -k_B T \log Z, \qquad Z = \int_{\Omega} \exp(-\beta U(x))\,dx \tag{1}$$
where $\Omega \subseteq \mathbb{R}^d$, $U: \Omega \to \mathbb{R}$ is the energy function, assumed to be such that $Z < \infty$, and $\beta = 1/(k_B T)$ combines the Boltzmann constant $k_B$ and temperature $T$. Rather than calculating $F$ directly, one typically estimates the free energy difference between systems (or states) $S_a$ and $S_b$ with energies $U_a$ and $U_b$, which is essential for biological conformational changes, ligand-macromolecule binding, and chemical reaction mechanisms (Wang et al., 2015):
$$\Delta F = F_b - F_a = -k_B T \log \frac{Z_b}{Z_a} \tag{2}$$
This computational challenge has driven numerous approaches. Zwanzig (1954) reformulated the problem as importance sampling, where one system serves as the proposal, enabling free energy difference estimation via Monte Carlo sampling. This free energy perturbation (FEP) method, however, suffers from high variance when the energies $U_a$ and $U_b$ of systems $S_a$ and $S_b$ differ significantly, particularly in high-dimensional spaces.
The authors contributed equally to this work. The order is randomly assigned and will be randomly reshuffled in each version of the paper to reflect this equal contribution.
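The free energy perturbation idea described in the introduction above can be made concrete with a small numerical check. This is a minimal sketch of the standard Zwanzig estimator, $\Delta F = -\beta^{-1} \log \langle \exp(-\beta (U_b - U_a)) \rangle_a$, and not the FEAT method itself. The two 1D harmonic systems, the spring constants `k_a` and `k_b`, and the helper name `fep_delta_f` are assumptions chosen so that the exact answer is known in closed form.

```python
import numpy as np

def fep_delta_f(u_a, u_b, samples_a, beta=1.0):
    """Zwanzig estimator of F_b - F_a using samples drawn from system a."""
    du = beta * (u_b(samples_a) - u_a(samples_a))
    # Log-mean-exp of -du, shifted by its maximum for numerical stability.
    shift = np.max(-du)
    log_mean = shift + np.log(np.mean(np.exp(-du - shift)))
    return -log_mean / beta

# Two hypothetical 1D harmonic systems U(x) = 0.5 * k * x^2.
beta = 1.0
k_a, k_b = 1.0, 2.0

def u_a(x):
    return 0.5 * k_a * x**2

def u_b(x):
    return 0.5 * k_b * x**2

# Equilibrium samples of system a: Gaussian with variance 1 / (beta * k_a).
rng = np.random.default_rng(0)
samples_a = rng.normal(0.0, 1.0 / np.sqrt(beta * k_a), size=200_000)

delta_f_fep = fep_delta_f(u_a, u_b, samples_a, beta)

# Exact result from Z = sqrt(2 * pi / (beta * k)): Delta F = log(k_b / k_a) / (2 * beta).
delta_f_exact = 0.5 * np.log(k_b / k_a) / beta

print(f"FEP estimate: {delta_f_fep:.4f}, exact: {delta_f_exact:.4f}")
```

The two systems here overlap strongly, so the estimator converges quickly. Increasing `k_b` or the dimensionality shrinks the overlap and inflates the variance, which is the failure mode the text attributes to FEP.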
- North America > United States (0.28)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Accurate and thermodynamically consistent hydrogen equation of state for planetary modeling with flow matching
Xie, Hao, Howard, Saburo, Mazzola, Guglielmo
Accurate determination of the equation of state of dense hydrogen is essential for understanding gas giants. Currently, there is still no consensus on methods for calculating its entropy, which plays a fundamental role and can result in qualitatively different predictions for Jupiter's interior. Here, we investigate various aspects of entropy calculation for dense hydrogen based on ab initio molecular dynamics simulations. Specifically, we employ the recently developed flow matching method to validate the accuracy of the traditional thermodynamic integration approach. We then clearly identify pitfalls in previous attempts and propose a reliable framework for constructing the hydrogen equation of state, which is accurate and thermodynamically consistent across a wide range of temperature and pressure conditions. This allows us to conclusively address the long-standing discrepancies in Jupiter's adiabat among earlier studies, demonstrating the potential of our approach for providing reliable equations of state of diverse materials.
- North America > United States (0.46)
- Europe > Switzerland (0.28)
A survey of probabilistic generative frameworks for molecular simulations
John, Richard, Herron, Lukas, Tiwary, Pratyush
Generative artificial intelligence is now a widely used tool in molecular science. Despite the popularity of probabilistic generative models, numerical experiments benchmarking their performance on molecular data are lacking. In this work, we introduce and explain several classes of generative models, broadly sorted into two categories: flow-based models and diffusion models. We select three representative models: Neural Spline Flows, Conditional Flow Matching, and Denoising Diffusion Probabilistic Models, and examine their accuracy, computational cost, and generation speed across datasets with tunable dimensionality, complexity, and modal asymmetry. Our findings are varied, with no one framework being the best for all purposes. In a nutshell, (i) Neural Spline Flows do best at capturing mode asymmetry present in low-dimensional data, (ii) Conditional Flow Matching outperforms other models for high-dimensional data with low complexity, and (iii) Denoising Diffusion Probabilistic Models appear best for low-dimensional data with high complexity. Our datasets include a Gaussian mixture model and the dihedral torsion angle distribution of the Aib$_9$ peptide, generated via a molecular dynamics simulation. We hope our taxonomy of probabilistic generative frameworks and numerical results may guide model selection for a wide range of molecular tasks.
- North America > United States > Maryland > Prince George's County > College Park (0.15)
- North America > United States > Maryland > Montgomery County (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Graph Neural Network-State Predictive Information Bottleneck (GNN-SPIB) approach for learning molecular thermodynamics and kinetics
Zou, Ziyue, Wang, Dedi, Tiwary, Pratyush
Molecular dynamics simulations offer detailed insights into atomic motions but face timescale limitations. Enhanced sampling methods have addressed these challenges but even with machine learning, they often rely on pre-selected expert-based features. In this work, we present the Graph Neural Network-State Predictive Information Bottleneck (GNN-SPIB) framework, which combines graph neural networks and the State Predictive Information Bottleneck to automatically learn low-dimensional representations directly from atomic coordinates. Tested on three benchmark systems, our approach predicts essential structural, thermodynamic and kinetic information for slow processes, demonstrating robustness across diverse systems. The method shows promise for complex systems, enabling effective enhanced sampling without requiring pre-defined reaction coordinates or input features.
- North America > United States > Maryland > Prince George's County > College Park (0.05)
- North America > United States > Maryland > Montgomery County (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- Research Report > Experimental Study (0.48)
- Research Report > New Finding (0.46)
Efficient mapping of phase diagrams with conditional normalizing flows
Schebek, Maximilian, Invernizzi, Michele, Noé, Frank, Rogal, Jutta
The accurate prediction of phase diagrams is of central importance for both the fundamental understanding of materials as well as for technological applications in material sciences. However, the computational prediction of the relative stability between phases based on their free energy is a daunting task, as traditional free energy estimators require a large amount of simulation data to obtain uncorrelated equilibrium samples over a grid of thermodynamic states. In this work, we develop deep generative machine learning models for entire phase diagrams, employing normalizing flows conditioned on the thermodynamic states, e.g., temperature and pressure, that they map to. By training a single normalizing flow to transform the equilibrium distribution sampled at only one reference thermodynamic state to a wide range of target temperatures and pressures, we can efficiently generate equilibrium samples across the entire phase diagram. Using a permutation-equivariant architecture allows us, thereby, to treat solid and liquid phases on the same footing. We demonstrate our approach by predicting the solid-liquid coexistence line for a Lennard-Jones system in excellent agreement with state-of-the-art free energy methods while significantly reducing the number of energy evaluations needed.
- Europe > Germany > Berlin (0.04)
- North America > United States > Texas > Harris County > Houston (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)